Proceedings of the National Academy of Sciences — Latest Matching Preprints

1

Where Natural Protein Sequences Stand out From Randomness

Weidmann, L.; Dijkstra, T.; Kohlbacher, O.; Lupas, A. N.

2019-07-20 evolutionary biology 10.1101/706119 medRxiv

Top 0.1%

57.9%

Approaches based on molecular evolution have organized natural proteins into a hierarchy of families, superfamilies, and folds, which are often pictured as islands in a great sea of unrealized and generally non-functional polypeptides. In contrast, approaches based on information theory have substantiated a mostly random scatter of natural proteins in global sequence space. We evaluate these opposing views by analyzing fragments of a given length derived from either a natural dataset or different random models. For this, we compile distances in sequence space between fragments within each dataset and compare the resulting distance distributions between sets. Even for 100-mers, more than 95% of distances can be accounted for by a random sequence model that incorporates the natural amino acid frequency of proteins. When further accounting for the specific residue composition of the respective fragments, which would include biophysical constraints of protein folding, more than 99% of all distances can be modeled. Thus, while the local space surrounding a protein is almost entirely shaped by common descent, the global distribution of proteins in sequence space is close to random, only constrained by divergent evolution through the requirement that all intermediates connecting two forms in evolution must be functional.
Significance Statement: When generating new proteins by evolution or design, can the entire sequence space be used, or do viable sequences mainly occur only in some areas of this space? As a result of divergent evolution, natural proteins mostly form families that occupy local areas of sequence space, suggesting the latter. Theoretical work, however, indicates that these local areas are highly diffuse and do not dramatically affect the statistics of sequence distribution, such that natural proteins can be considered to effectively cover global space randomly, though extremely sparsely. By comparing the distance distribution of natural sequences to that of various random models, we find that they are indeed distributed largely randomly, provided that the amino acid composition of natural proteins is respected.
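
The comparison of distance distributions described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the distance metric or the comparison statistic, so Hamming distance and histogram overlap are assumptions here, and the two composition models (`uniform`, `skewed`) are hypothetical stand-ins for the natural dataset and a random model.

```python
import random
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def hamming(a, b):
    """Number of positions at which two equal-length fragments differ."""
    return sum(x != y for x, y in zip(a, b))

def distance_distribution(fragments):
    """Normalized histogram of pairwise Hamming distances within one dataset."""
    counts = Counter(hamming(a, b) for a, b in combinations(fragments, 2))
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}

def random_fragments(n, length, weights, rng):
    """Draw n fragments with residues sampled independently from `weights`."""
    return ["".join(rng.choices(AMINO_ACIDS, weights=weights, k=length))
            for _ in range(n)]

def overlap(p, q):
    """Shared probability mass of two distributions (1.0 means identical)."""
    return sum(min(p.get(d, 0.0), q.get(d, 0.0)) for d in set(p) | set(q))

rng = random.Random(0)
uniform = [1.0] * len(AMINO_ACIDS)                            # equal frequencies
skewed = [5.0 if aa in "ALGVE" else 1.0 for aa in AMINO_ACIDS]  # hypothetical bias

dist_uniform = distance_distribution(random_fragments(200, 20, uniform, rng))
dist_skewed = distance_distribution(random_fragments(200, 20, skewed, rng))
shared = overlap(dist_uniform, dist_skewed)
print(f"fraction of distance mass shared by the two models: {shared:.3f}")
```

In this toy setting, the shared mass plays the role of the "fraction of distances accounted for" by a model; with real data, one set of fragments would come from natural proteins instead of a second random model.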

2

Evolutionary Sequence and Structural Basis for the Epistatic Origins of Drug Resistance in HIV

Biswas, A.; Choudhuri, I.; Huang, K.; Sun, Q.; Sali, A.; Echeverria, I.; Haldane, A.; Levy, R.; Lyumkis, D.

2025-05-02 biophysics 10.1101/2025.04.30.651576 medRxiv

Top 0.1%

54.7%

The emergence of drug resistance in the human immunodeficiency virus (HIV) remains a formidable challenge to the long-term efficacy of antiretroviral therapy (ART). A growing body of evidence highlights the critical role of epistasis, the dependence of mutational effects on the sequence context, in shaping the fitness landscape of HIV under ART-induced selection pressure. However, the biophysical origins of the epistatic interactions involved in engendering drug-resistance mutations (DRMs) remain unclear. Are the mutational correlations "intrinsic" to the properties of the protein, or do they arise because of drug binding? We use a Potts sequence-covariation statistical energy model built on patient-derived HIV-1 protein sequences to construct computational double mutant cycles that probe pairwise epistasis for all observed mutations across the three major HIV drug-target enzymes. We find that the strongest epistatic effects occur between mutations at residue positions that frequently mutate during the course of ART, termed resistance-associated positions. To investigate the structural origins of the strongest epistatic interactions, we perform ~100 free energy perturbation molecular dynamics simulations, revealing that the primary contribution to the pairwise epistasis between DRMs arises from cooperative effects on protein stability and folding as an intrinsic consequence of the protein mutational landscape. The results collectively reinforce a mechanism of resistance evolution whereby viruses escape drug pressure by selectively engendering mutations at "intrinsically" coupled sites, allowing them to cooperatively ameliorate fitness detriments incurred by individual DRMs.
Significance: Epistasis refers to the phenomenon where the effect of a mutation on protein structure and function is dependent on the genetic sequence background of the mutation, resulting in the combined effect of mutations being non-additive. Epistasis plays a significant role in the evolution of drug resistance in viruses such as HIV under therapeutic selection pressure. We combine a protein sequence coevolutionary model and molecular dynamics free energy simulations to identify and probe the mechanistic origins of the strongest epistatic interactions connecting HIV drug-resistance mutations. The work establishes a foundation to probe the molecular bases of epistasis and predict the evolution of resistance predicated on the knowledge of epistatic interaction networks.
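
A computational double mutant cycle of the kind mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the sign convention of the energy, the function names, and every parameter value in `fields` and `couplings` are hypothetical toy choices.

```python
from itertools import combinations

def potts_energy(seq, fields, couplings):
    """Toy Potts energy: E = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j).
    Missing entries are treated as zero (an assumption of this sketch)."""
    energy = -sum(fields[i].get(res, 0.0) for i, res in enumerate(seq))
    for i, j in combinations(range(len(seq)), 2):
        energy -= couplings.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return energy

def mutate(seq, pos, res):
    """Return a copy of seq with the residue at pos replaced by res."""
    return seq[:pos] + res + seq[pos + 1:]

def double_mutant_cycle(wt, mut_a, mut_b, fields, couplings):
    """Pairwise epistasis: E(ab) - E(a) - E(b) + E(wt).
    Zero means the two mutations combine additively."""
    (i, a), (j, b) = mut_a, mut_b
    e_wt = potts_energy(wt, fields, couplings)
    e_a = potts_energy(mutate(wt, i, a), fields, couplings)
    e_b = potts_energy(mutate(wt, j, b), fields, couplings)
    e_ab = potts_energy(mutate(mutate(wt, i, a), j, b), fields, couplings)
    return e_ab - e_a - e_b + e_wt

# Hypothetical three-residue system with one coupled pair (positions 0 and 2).
wild_type = "AKL"
fields = [{"A": 0.5, "V": 0.2}, {"K": 0.3}, {"L": 0.4, "M": 0.1}]
couplings = {(0, 2): {("A", "L"): 0.6, ("V", "M"): 0.9}}

epistasis = double_mutant_cycle(wild_type, (0, "V"), (2, "M"), fields, couplings)
print(f"pairwise epistasis: {epistasis:.2f}")
```

In a Potts model the single-site field terms cancel in the cycle, so the result depends only on the coupling entries between the two mutated positions.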

3

Protein unfolding thermodynamics predict multicomponent phase behavior

Rana, N.; Kodirov, R.; Shakya, A.; King, J. T.

2023-05-26 biophysics 10.1101/2023.05.26.542380 medRxiv

Top 0.1%

53.9%

An increasing number of proteins are known to undergo liquid-liquid phase separation (LLPS), with or without nucleic acids or partner proteins, forming dense liquid-like phases termed biomolecular condensates. This physical phenomenon has been implicated in the existence of cellular membraneless organelles as well as in the formation of pathological protein aggregates in several human diseases. While common structural features of proteins with a propensity to undergo LLPS have been well documented, currently there is no thermodynamic framework capable of predicting the phase behavior of native proteins. Here, we show that two fundamental thermodynamic properties associated with the unfolding of a native protein, change in heat capacity (ΔC_unfold) and change in Gibbs free energy (ΔG_unfold), are sufficient to predict the formation of multicomponent biomolecular condensates. We find that proteins with small ΔC_unfold and ΔG_unfold values, which indicate a native state that is thermodynamically similar to the fully unfolded state, promote LLPS. In contrast, proteins with large ΔC_unfold and ΔG_unfold values promote aggregation. We also demonstrate that the stability of the liquid-like condensate can be predicted from the proximity of a protein's thermodynamic variable values to the phase boundary. This work elucidates a deep connection between single-protein thermodynamics and multicomponent phase behavior, and provides an avenue for predicting pathological droplet-aggregate transition.
Significance Statement: Biomolecular phase transitions, such as LLPS of proteins and nucleic acids, are emerging as an important concept in understanding the link between dysregulation of membraneless compartmentalization in cells and several human diseases. Currently, analysis of structural features such as intrinsic disorder and charged residue content is the "rule-of-thumb" to predict a protein's propensity to undergo LLPS. There are several instances where this qualitative approach fails, potentially due to not accounting for a protein's native structure. We demonstrate an empirical correlation between the unfolding thermodynamics of native proteins and their phase behavior, which enables a quantitative prediction of multicomponent LLPS. This novel approach, based on single-protein thermodynamics, has the potential to quantitatively predict pathological phase transitions of proteins in degenerative human diseases.

4

Connecting sequence features within the disordered C-terminal linker of B. subtilis FtsZ to function and bacterial cell division

Shinn, M. K.; Cohan, M. C.; Bullock, J. L.; Ruff, K. M.; Levin, P. A.; Pappu, R. V.

2022-06-29 biophysics 10.1101/2022.06.29.498098 medRxiv

Top 0.1%

53.5%

Intrinsically disordered regions (IDRs) can function as autoregulators of folded enzymes to which they are tethered. One example is the bacterial cell division protein, FtsZ. This includes a folded core and a C-terminal tail (CTT) that encompasses a poorly conserved, disordered C-terminal linker (CTL) and a well-conserved 17-residue C-terminal peptide (CT17). Sites for GTPase activity of FtsZs are formed at the interface between GTP binding sites and T7 loops on cores of adjacent subunits within dimers. Here, we explore the basis of autoregulatory functions of the CTT in Bacillus subtilis FtsZ (Bs-FtsZ). Molecular simulations show that the CT17 of Bs-FtsZ makes statistically significant CTL-mediated contacts with the T7 loop. Statistical Coupling Analysis of more than 10^3 sequences from FtsZ orthologs reveals clear covariation of the T7 loop and the CT17 with most of the core domain whereas the CTL is under independent selection. Despite this, we discover the conservation of non-random sequence patterns within CTLs across orthologs. To test how the non-random patterns of CTLs mediate CTT-core interactions and modulate FtsZ functionalities, we designed Bs-FtsZ variants by altering the patterning of oppositely charged residues within the CTL. Such alterations disrupt the core-CTT interactions, lead to anomalous assembly and inefficient GTP hydrolysis in vitro and protein degradation, aberrant assembly, and disruption of cell division in vivo. Our findings suggest that viable CTLs in FtsZs are likely to be IDRs that encompass non-random, functionally relevant sequence patterns that also preserve three-way covariation of the CT17, the T7 loop, and core domain.
Significance Statement: Z-ring formation by the protein FtsZ controls cell division in rod-shaped bacteria. The C-terminus of FtsZ encompasses a disordered C-terminal linker (CTL) and a conserved CT17 motif. Both modules are essential for Z-ring formation and proper localization of FtsZ in cells. Previous studies suggested that generic intrinsically disordered regions (IDRs) might be suitable functional replacements for naturally occurring CTLs. Contrary to this suggestion, we find that the sequence-encoded conformational properties of CTLs help mediate autoregulatory interactions between covarying regions within FtsZ. Functional properties of the CTL are encoded via evolutionarily conserved, non-random sequence patterns. Disruption of these patterns impairs molecular functions and cellular phenotypes. Our findings have broad implications for discovering functionally consequential sequence features within IDRs of other proteins.

5

Less is more in language production: Shorter sentences contain more informative words

Rezaii, N.; Ren, B.; Quimby, M.; Hochberg, D.; Dickerson, B.

2022-06-03 neurology 10.1101/2022.06.02.22275938 medRxiv

Top 0.1%

52.7%

Agrammatism is characterized by short sentences, the omission of function words, a higher ratio of heavy to light verbs, and a decreased use of verbs relative to nouns. Despite the observation of these phenomena more than two centuries ago, there has been no unifying theory to explain all features of agrammatism. Here, by first examining the language of patients with primary progressive aphasia, we show that the seemingly heterogeneous features of agrammatism can be explained by a process that selects lower frequency words over their higher frequency alternatives in the context of a limitation in sentence production, likely to increase the informational content of sentences. We further show that when healthy speakers are constrained to produce short sentences, features of agrammatism emerge in their language. Finally, we show that these findings instantiate a general property in healthy language production in which shorter sentences are constructed by selecting lower frequency words.

6

Genetic risk scores of disease and mortality capture differences in longevity, economic behavior, and insurance outcomes

Karlsson Linner, R.; Koellinger, P. D.

2020-04-02 health economics 10.1101/2020.03.30.20047290 medRxiv

Top 0.1%

52.2%

Widespread genetic testing for diseases may cause adverse selection, escalating premiums, or discrimination in various insurance markets. Here, without systematically informing study participants of their genetic predisposition, we estimate to what extent genetic data are informative about differences in longevity, health expectations, and economic behavior. We compute measures of genetic liability (polygenic scores) for 27 common diseases and mortality risks in 9,272 participants of the Health and Retirement Study (HRS). Survival analysis suggests that the highest decile of cumulative genetic risk can distinguish a median lifespan up to 4.5 years shorter, a difference that is similar to or larger than that distinguished by conventional actuarial risk factors, including sex. Furthermore, greater genetic liability is associated with less long-term care insurance, among other economic behaviors. We conclude that the rapid developments in genetic epidemiology pose new challenges for regulating consumer genetics and insurance markets, requiring urgent attention from policymakers.

7

Criticality of resting-state EEG predicts perturbational complexity and level of consciousness during anesthesia

Maschke, C.; O'Byrne, J.; Colombo, M. A.; Boly, M.; Gosseries, O.; Laureys, S.; Rosanova, M.; Jerbi, K.; Blain-Moraes, S.

2023-10-31 neuroscience 10.1101/2023.10.26.564247 medRxiv

Top 0.1%

52.2%

Consciousness has been proposed to be supported by electrophysiological patterns poised at criticality, a dynamical regime which exhibits adaptive computational properties, maximally complex patterns and divergent sensitivity to perturbation. Here, we investigated dynamical properties of the resting-state electroencephalogram of healthy subjects undergoing general anesthesia with propofol, xenon or ketamine. We then studied the relation of these dynamic properties with the perturbational complexity index (PCI), which has shown remarkably high sensitivity in detecting consciousness independent of behavior. All participants were unresponsive under anesthesia, while consciousness was retained only during ketamine anesthesia (in the form of vivid dreams), enabling an experimental dissociation between unresponsiveness and unconsciousness. We estimated (i) avalanche criticality, (ii) chaoticity, and (iii) criticality-related measures, and found that states of unconsciousness were characterized by a distancing from both the edge of activity propagation and the edge of chaos. We were then able to predict individual subjects' PCI (i.e., PCImax) with a mean absolute error below 7%. Our results establish a firm link between the PCI and criticality and provide further evidence for the role of criticality in the emergence of consciousness.
Significance Statement: Complexity has long been of interest in consciousness science and had a fundamental impact on many of today's theories of consciousness. The perturbational complexity index (PCI) uses the complexity of the brain's response to cortical perturbations to quantify the presence of consciousness. We propose criticality as a unifying framework underlying maximal complexity and sensitivity to perturbation in the conscious brain. We demonstrate that criticality measures derived from resting-state electroencephalography can distinguish conscious from unconscious states, using propofol, xenon and ketamine anesthesia, and from these measures we were able to predict the PCI with a mean error below 7%. Our results support the hypothesis that critical brain dynamics are implicated in the emergence of consciousness and may provide new directions for the assessment of consciousness.

8

Atomistic simulation study reveals transduction of mechanical work generated by ATP hydrolysis onto myosin II functional loops

Kurisaki, I.; Higuchi, H.; Tanaka, S.; Suzuki, M.

2026-01-23 biophysics 10.64898/2026.01.22.700696 medRxiv

Top 0.1%

51.6%

Myosin II is a paradigm of biological molecular energy transducers that convert the chemical energy of ATP, via hydrolysis, into mechanical work with high efficiency in living cells. The physicochemical mechanisms underlying energy conversion by myosin II have been extensively investigated. However, a remaining challenge concerns the initial stage of the ATP hydrolysis cycle, specifically the conversion of ATP-to-ADP:Pi, because of technical difficulties in directly and seamlessly observing atomic trajectories during this early step and the subsequent processes. Here, we simulate the consequences of the chemical reaction by switching force field parameters between reactant and product systems within classical molecular dynamics simulations. We consider two possible ATP hydrolysis mechanisms in which either singly protonated (Pi²⁻) or doubly protonated (Pi⁻) inorganic phosphate is produced. Our results indicate that, in both Pi-generating processes, the kinetic energy supplied by conversion of ATP-to-ADP:Pi increases only transiently, whereas myosin functional loops store several kcal/mol of potential energy. Meanwhile, the amount of potential energy stored in the Pi²⁻-producing reaction is approximately five-fold larger than that in the Pi⁻-producing reaction. Our analysis indicates that this difference emerges when ATP-derived mechanical work is transmitted into the functional loops via rearrangement of intermolecular hydrogen bonds with the hydrolysis products. Notably, these steric interactions remain stably established even when the kinetic energy input associated with ATP-to-ADP:Pi conversion is actively quenched. We therefore propose that transient storage of ATP-derived mechanical work as atomic-scale conformational strain within myosin molecules constitutes a critical step for efficient conversion of ATP chemical energy into mechanical work under conditions of intracellular thermal noise.
Significance: Myosin II converts chemical energy released by ATP hydrolysis into mechanical work with exceptionally high efficiency, even in the presence of substantial thermal fluctuations in living systems. It has long been proposed that release of inorganic phosphate (Pi) from myosin II is a critical step for efficient energy transfer. However, this prevailing view has been challenged by recent state-of-the-art experimental observations, necessitating a reexamination of the chemical steps responsible for ATP-derived energy storage. Here, we employ atomistic molecular dynamics simulations specifically designed to capture key aspects of ATP hydrolysis. We demonstrate that ATP hydrolysis increases the potential energy of myosin II functional loops while Pi remains bound, and that this increase occurs independently of heat release associated with the reaction. Together, these findings identify an alternative mechanism for energy retention and emphasize the intrinsic robustness of myosin II as a highly efficient molecular motor.

9

Evolution of research topics and paradigms in plant sciences

Shiu, S.-H.; Lehti-Shiu, M. D.

2023-10-03 plant biology 10.1101/2023.10.02.560457 medRxiv

Top 0.1%

51.5%

Scientific advances due to conceptual or technological innovations can be revealed by examining how research topics have evolved. But such topical evolution is difficult to uncover and quantify because of the large body of literature and the need for expert knowledge from a wide range of areas in any field. Here we used machine learning and language models to classify plant science citations into topics representing interconnected, evolving subfields. The changes in prevalence of topical records over the last 50 years reflect major research paradigm shifts and recent radiation of new topics, as well as turnovers of model species and vastly different plant science research trajectories among countries. Our approaches readily summarize the topical diversity and evolution of a scientific field with hundreds of thousands of relevant papers, and they can be applied broadly to other fields.
Significance statement: Changes in scientific paradigms are foundational for the advancement of science, but such changes are difficult to summarize, quantify, and illustrate. These challenges are exacerbated by the rapid, exponential growth of literature. Applying a combination of machine learning and language modeling to hundreds of thousands of published abstracts, we demonstrate that a scientific field (i.e., plant science) can be summarized as interconnected subfields evolving from one another. We also reveal insights into major research trends and the rise and decline in the use of model organisms in different countries. Our study demonstrates how artificial intelligence and language models can be broadly applied to understand scientific advances that inform science policy and funding decisions.

10

Multiscale Analysis of PNPLA2 and PNPLA3 Membrane Targeting

Kumar, A.; Teskey, G.; Mottillo, E.; Huang, Y.-m. M.

2026-02-14 biophysics 10.64898/2026.02.12.705593 medRxiv

Top 0.1%

51.4%

Lipid droplets (LDs) are dynamic organelles that regulate cellular lipid storage and mobilization through the coordinated action of LD-associated proteins. Patatin-like phospholipase domain-containing proteins PNPLA2 (ATGL) and PNPLA3 are central regulators of lipid metabolism, yet the molecular mechanisms underlying their membrane targeting and distinct enzymatic activities remain poorly understood. Here, we combine coarse-grained and all-atom molecular dynamics simulations with enhanced sampling to investigate how PNPLA2 and PNPLA3 associate with endoplasmic reticulum (ER) and LD membranes. Despite sharing a conserved N-terminal patatin domain, the two proteins exhibit distinct membrane-binding modes driven by divergent C-terminal amphipathic helices. In both proteins, membrane association is mediated primarily by deep insertion of C-terminal helices, while the patatin domain provides surface contact. PNPLA2 forms a deeply embedded U-shaped helical bundle on LDs that induces pronounced membrane curvature and promotes opening of the catalytic dyad, consistent with its high triglyceride lipase activity. In contrast, PNPLA3 engages membranes through a more flexible helical arrangement that maintains a compact catalytic geometry and limits substrate accessibility. Membrane composition further modulates these interactions and leads to protein-specific lipid redistribution and curvature remodeling. Fluorescence microscopy experiments validate the computational predictions and demonstrate that mutation of a single arginine residue within the C-terminal region is sufficient to reduce LD targeting of both proteins. These results establish a mechanistic connection between membrane binding, conformational plasticity, and catalytic regulation in PNPLA2 and PNPLA3. Our work provides molecular insights into how lipid environments tune the function of LD-associated enzymes.
Author Summary: LDs are essential cellular organelles that control how fats are stored and released, a process that relies on the precise recruitment and regulation of lipid-metabolizing enzymes. Our work focuses on two closely related enzymes, PNPLA2 (ATGL) and PNPLA3, which play central but distinct roles in lipid metabolism and metabolic diseases. Using a combination of multiscale modeling simulations and fluorescence microscopy, we examine how these proteins recognize and bind to ER and LD membranes. Although PNPLA2 and PNPLA3 share a conserved catalytic core, we show that they interact with membranes in different ways due to differences in their C-terminal amphipathic helices. We find that PNPLA2 forms a deeply embedded helical arrangement that reshapes the membrane and promotes access to its catalytic site, which explains why it typically shows strong lipase activity. In contrast, PNPLA3 adopts a more compact membrane-bound catalytic geometry that limits substrate access and enzymatic activity. We further applied fluorescence microscopy to experimentally validate the computational predictions. The results show that mutation of a single arginine residue within the membrane-binding helix reduces LD targeting. These findings reveal how membrane association and protein conformational dynamics jointly regulate catalytic accessibility and activity.

11

Identification of NLR-associated amyloid signaling motifs in filamentous bacteria

Dyrka, W.; Coustou, V.; Daskalov, A.; Lends, A.; Bardin, T.; Berbon, M.; Kauffmann, B.; Blancard, C.; Salin, B.; Loquet, A.; Saupe, S. J.

2020-01-06 microbiology 10.1101/2020.01.06.895854 medRxiv

Top 0.1%

51.3%

NLRs (Nod-like receptors) are intracellular receptors regulating immunity, symbiosis, non-self recognition and programmed cell death in animals, plants and fungi. Several fungal NLRs employ amyloid signaling motifs to activate downstream cell-death inducing proteins. Herein, we identify in Archaea and Bacteria, short sequence motifs that occur in the same genomic context as fungal amyloid signaling motifs. We identify 10 families of bacterial amyloid signaling sequences (we term BASS), one of which (BASS3) is related to mammalian RHIM and fungal PP amyloid motifs. We find that BASS motifs occur specifically in bacteria forming multicellular structures (mainly in Actinobacteria and Cyanobacteria). We analyze experimentally a subset of these motifs and find that they behave as prion forming domains when expressed in a fungal model. All tested bacterial motifs also formed fibrils in vitro. We analyze by solid-state NMR and X-ray diffraction, the amyloid state of a protein from Streptomyces coelicolor bearing the most common BASS1 motif and find that it forms highly ordered non-polymorphic amyloid fibrils. This work expands the paradigm of amyloid signaling to prokaryotes and underlies its relation to multicellularity.

12

The unusual chloroplast ATP synthase redox domain ensures enzyme activity and elevates the electrochemical proton gradient in dark-adapted Chlamydomonas reinhardtii.

Lebok, L.; Buchert, F.

2022-11-09 plant biology 10.1101/2022.11.08.515721 medRxiv

Top 0.1%

50.8%

To maintain CO2 fixation in the Calvin-Benson-Bassham cycle, multi-step regulation of the chloroplast ATP synthase (CF1Fo) is crucial to balance the ATP output of photosynthesis with protection of the apparatus. A well-studied mechanism is thiol modulation, a light/dark regulation through reversible cleavage of a disulfide in the CF1Fo γ-subunit. The disulfide hampers ATP synthesis and hydrolysis reactions in dark-adapted CF1Fo from land plants by increasing the required transmembrane electrochemical proton gradient. Here, we show in Chlamydomonas reinhardtii that algal CF1Fo is differently regulated in vivo. A specific hairpin structure in the γ-subunit redox domain disconnects activity regulation from disulfide formation in the dark. Electrochromic shift measurements suggested that the hairpin kept wild type CF1Fo active whereas the enzyme was switched off in algal mutant cells expressing a plant-like hairpin structure. The hairpin segment swap resulted in an elevated electrochemical proton gradient threshold to activate plant-like CF1Fo, increased by ~1.4 photosystem (PS) I charge separations. The resulting dark-equilibrated electrochemical proton gradient dropped in the mutants by ~2.7 PSI charge separation equivalents. Photobioreactor experiments showed no phenotypes in autotrophic aerated mutant cultures. In contrast, chlorophyll fluorescence measurements under heterotrophic dark conditions point to a reduced plastoquinone pool in cells with the plant-like CF1Fo as the result of bioenergetic bottlenecks. Our results suggest that the lifestyle of Chlamydomonas reinhardtii requires a specific CF1Fo dark regulation that partakes in metabolic coupling between the chloroplast and acetate-fueled mitochondria.
Significance Statement: The microalga Chlamydomonas reinhardtii exhibits a non-classical thiol modulation of the chloroplast ATP synthase for the sake of metabolic flexibility. The redox switch, although established, was functionally disconnected in vivo thanks to a hairpin segment in the γ-subunit redox domain. Dark enzymatic activity was prevented by replacing the algal hairpin segment with the one from land plants, restoring a classical thiol modulation pattern. Thereby, ATP was saved at the expense of thylakoid membrane energization levels in the dark. However, metabolism was impaired upon silencing dark ATPase activity, indicating that a functional disconnect from the redox switch represents an adaptation to different ecological niches.

13

Ploidy alters root anatomy and shapes the evolution of wheat polyploids

Sidhu, J. S.; Gill, H. S.; Walker, S.; Rangarajan, H.; Lopez-Valdivia, I.; Singh, M.; Sawers, R. J.; Sehgal, S.; Lynch, J.

2025-07-11 plant biology 10.1101/2025.07.11.663263 medRxiv

Top 0.1%

50.8%

Polyploidization played a crucial role in crop domestication and modern agriculture. While increased cell size in polyploids is known to enhance plant biomass and vigor, its impact on soil exploration remains poorly understood. Using wheat as a model, we identify a ploidy-induced belowground domestication syndrome, characterized by (a) increased root cortical cell size reducing root respiration, nitrogen content, and phosphorus content; (b) enlarged metaxylem vessels, increasing axial hydraulic conductance; and (c) blunter root tips, limiting penetration ability in compacted soils. Our empirical and in silico experiments show that reduced root respiration and reduced cellular nutrient content in wheat polyploids improved nutrient use and acquisition efficiency under suboptimal nitrogen and phosphorus availability. These adaptations would have been advantageous in nutrient-depleted agroecosystems of the Pre-Pottery Neolithic B (PPNB) Fertile Crescent, where continuous cultivation depleted soil fertility over time. Functional-structural modeling indicates that larger cortical cells in wheat polyploids increase vacuolar occupancy, reducing root metabolic costs. Enhanced axial hydraulic conductance may have improved water transport, an advantage in irrigated PPNB agroecosystems. However, polyploids have blunter root tips, which reduces their penetration ability in compacted soils, making them less suited for native soils with greater mechanical impedance. We propose that root anatomical changes driven by ploidy played an important role in adaptations of wheat domesticates to PPNB agriculture.
One-Sentence Summary: Polyploidy-induced changes in root anatomy may have improved adaptation of wheat to Neolithic agroecosystems.

14

When is microbial cross-feeding evolutionarily stable?

Lopez, J. A.; Liu, B.; Li, Z.; Donia, M. S.; Wingreen, N. S.

2025-05-17 microbiology 10.1101/2025.05.16.654511 medRxiv

Top 0.1%

50.8%

Cross-feeding, a phenomenon in which organisms share metabolites, is frequently observed in microbial communities across the natural world. One of the most common forms is waste-product cross-feeding, a unidirectional interaction in which the waste products of one microbe support the growth of another. Despite its ubiquity, it is not well understood why waste-product cross-feeding persists when a single organism could in principle perform both the producer and consumer role. To address this question, we first analyze cross-feeding evolution in a minimal model of microbial metabolism. The model describes multi-step extraction of energy from a substrate in a simple but thermodynamically correct formulation. Surprisingly, we find that cross-feeding is never evolutionarily stable in this model. By analyzing models with more complex growth functions, we identify a novel mechanism for the evolutionary stability of waste-product cross-feeding, namely, generalized intracellular metabolite toxicity. Such toxicity arises because, in excess, the same intracellular metabolites that cells require for metabolism can be detrimental to growth (e.g., due to osmotic stress). We show that some but not all forms of such toxicity can lead to evolutionarily stable consortia of microbes that cross-feed waste products. This stability results from the potential of such consortia to divide the burden of toxic metabolites among a larger population, allowing them to perform their collective metabolism more efficiently than non-cross-feeders. More generally, we predict that growth penalties that scale nonlinearly with intracellular metabolite levels promote cross-feeding. We find that this mechanism for cross-feeding evolutionary stability implies nontrivial population dynamics, such as a discontinuity in population biomass at the onset of cross-feeding.
Significance statement: The chemical reactions performed by microbes have large impacts on our world: from nitrification within the nitrogen cycle to the breakdown of fiber in animal digestive tracts. A striking commonality in many of these processes is that the chemical reactions are a collective effort, with the complete reaction subdivided between many microbes. This phenomenon is known as cross-feeding, and its origins are poorly understood. Understanding the eco-evolutionary forces promoting cross-feeding has the potential to not only enhance our understanding of natural ecosystems, but also improve our ability to engineer such distributed reactions in biotechnology. Here, we develop mathematical theory for the evolution of a common type of cross-feeding and provide predictions for what metabolic and environmental conditions promote this behavior.

15

Rules for designing protein fold switches and their implications for the folding code

Chen, Y.; He, Y.; Ruan, B.; Choi, E. J.; Chen, Y.; Motabar, D.; Soloman, T.; Simmerman, R.; Kauffman, T.; Gallagher, D. T.; Orban, J.; Bryan, P. N.

2021-05-18 biophysics 10.1101/2021.05.18.444643 medRxiv

Top 0.1%

50.3%

We have engineered switches between the three most common small folds, 3α, 4β+α, and α/β-plait, referred to here as A, B, and S, respectively. Mutations were introduced into the natural S protein until sequences were created that have a stable S-fold in their longer (~90 amino acid) form and have an alternative fold (either A or B) in their shorter (56 amino acid) form. Five sequence pairs were designed and key structures were determined using NMR spectroscopy. Each protein pair is 100% identical in the 56 amino acid region of overlap. Several rules for engineering switches emerged. First, designing one sequence with good native state interactions in two folds requires care but is feasible. Once this condition is met, fold populations are determined by the stability of the embedded A- or B-fold relative to the S-fold and the conformational propensities of the ends that are generated in the switch to the embedded fold. If the stabilities of the embedded fold and the longer fold are similar, conformation is highly sensitive to mutation so that even a single amino acid substitution can radically shift the population to the alternative fold. The results provide insight into why dimorphic sequences can be engineered and sometimes exist in nature, while most natural protein sequences populate single folds. Proteins may evolve toward unique folds because dimorphic sequences generate interactions that destabilize and can produce aberrant functions. Thus, two-state behavior may result from nature's negative design rather than being an inherent property of the folding code.
Significance Statement: We establish general rules for designing protein fold switches by engineering dimorphic sequences that link the three most common small folds. The fact that switches can be engineered in arbitrary and common protein folds sheds light on several important questions: 1) What is the generality of fold switching? 2) What types of folds are amenable to switching? 3) What properties are shared by sequences that can fold into two completely different structures? This work has implications for understanding how amino acid sequence encodes structure, how proteins evolve, how mutation is related to disease, and how function is annotated to sequences of unknown structure.
Classification: Biological Sciences: Biochemistry; Physical Sciences: Biophysics and Computational Biology

16

Evidence for Behavioral Autorepression in Covid-19 Epidemiological Dynamics

Lewis, D. D.; Pablo, M.; Chen, X.; Simpson, M. L.; Weinberger, L.

2024-06-09 epidemiology 10.1101/2024.06.07.24308626 medRxiv

Top 0.1%

49.1%

It has long been hypothesized that behavioral reactions to epidemic severity autoregulate infection dynamics, for example when susceptible individuals self-sequester based on perceived levels of circulating disease. However, evidence for such behavioral autorepression has remained elusive, and its presence could significantly affect epidemic forecasting and interventions. Here, we analyzed early COVID-19 dynamics at 708 locations over three epidemiological scales (96 countries, 50 US states, and 562 US counties). Signatures of behavioral autorepression were identified through: (i) a counterintuitive mobility-death correlation, (ii) fluctuation-magnitude analysis, and (iii) dynamics of SARS-CoV-2 infection waves. These data enabled calculation of the average behavioral-autorepression strength (i.e., negative feedback gain) across different populations. Surprisingly, incorporating behavioral autorepression into conventional models was required to accurately forecast COVID-19 mortality. Models also predicted that the strength of behavioral autorepression has the potential to alter the efficacy of non-pharmaceutical interventions. Overall, these results provide evidence for the long-hypothesized existence of behavioral autorepression, which could improve epidemic forecasting and enable more effective application of non-pharmaceutical interventions during future epidemics.
Significance: Challenges with epidemiological forecasting during the COVID-19 pandemic suggested gaps in underlying model architecture. One long-held hypothesis, typically omitted from conventional models due to lack of empirical evidence, is that human behaviors lead to intrinsic negative autoregulation of epidemics (termed behavioral autorepression). This omission substantially alters model forecasts. Here, we provide independent lines of evidence for behavioral autorepression during the COVID-19 pandemic, demonstrate that it is sufficient to explain counterintuitive data on shutdowns, and provide a mechanistic explanation of why early shutdowns were more effective than delayed, high-intensity shutdowns. We empirically measure autorepression strength, and show that incorporating autorepression dramatically improves epidemiological forecasting. The autorepression phenomenon suggests that tailoring interventions to specific populations may be warranted.

17

Origin of the type I antifreeze gene in flounders in response to Cenozoic climate change

Graham, L. A.; Gauthier, S. Y.; Davies, P. L.

2021-09-24 evolutionary biology 10.1101/2021.09.21.461085 medRxiv

Top 0.1%

48.2%

Antifreeze proteins (AFPs) inhibit ice growth within fish and protect them from freezing in icy seawater. Alanine-rich, alpha-helical AFPs (type I) have independently (convergently) evolved in four branches of fishes, one of which is a subsection of the righteye flounders. The origin of this gene family has been elucidated by sequencing two loci from a starry flounder, Platichthys stellatus, collected off Vancouver Island, British Columbia. The first locus had two alleles that demonstrated the plasticity of the AFP gene family, one encoding 33 AFPs and the other allele only four. In the closely related Pacific halibut, this locus encodes multiple Gig2 (antiviral) proteins, but in the starry flounder, the Gig2 genes were found at a second locus due to a lineage-specific duplication event. An ancestral Gig2 gave rise to a 3-kDa "skin" AFP isoform, encoding three Ala-rich 11-a.a. repeats, that is expressed in skin and other peripheral tissues. Subsequent gene duplications, followed by internal duplications of the 11-a.a. repeat and the gain of a signal sequence, gave rise to circulating AFP isoforms. One of these, the "hyperactive" 32-kDa Maxi, likely underwent a contraction to a shorter 3.3-kDa "liver" isoform. Present-day starry flounders found in Pacific Rim coastal waters from California to Alaska show a positive correlation between latitude and AFP gene dosage, with the shorter allele being more prevalent at lower latitudes. This study conclusively demonstrates that the flounder AFP arose from the Gig2 gene, so it is evolutionarily unrelated to the three other classes of type I AFPs from non-flounders. Additionally, this gene arose and underwent amplification coincident with the onset of ocean cooling during the Cenozoic ice ages.

18

A redox switch allows binding of ferrous and ferric ions in the cyanobacterial iron binding protein FutA from Prochlorococcus

Bolton, R.; Machelett, M. M.; Stubbs, J.; Axford, D.; Caramello, N.; Catapano, L.; Maly, M.; Rodrigues, M. J.; Cordery, C.; Tizzard, G. J.; MacMillan, F.; Engilberge, S.; von Stetten, D.; Tosha, T.; Sugimoto, H.; Worrall, J. A.; Webb, J. S.; Zubkov, M.; Coles, S.; Mathieu, E.; Steiner, R. A.; Murshudov, G. N.; Schrader, T. E.; Orville, A. M.; Royant, A.; Evans, G.; Hough, M. A.; Owen, R.; Tews, I.

2023-05-23 biophysics 10.1101/2023.05.23.541926 medRxiv

Top 0.1%

44.5%

The marine cyanobacterium Prochlorococcus is a main contributor to global photosynthesis, whilst being limited by iron availability. Cyanobacterial genomes typically encode two different types of FutA iron binding proteins: periplasmic FutA2 ABC transporter subunits bind Fe(III), while cytosolic FutA1 binds Fe(II). Owing to their small size and their economized genome, Prochlorococcus ecotypes typically possess a single futA gene. How the encoded FutA protein might bind different Fe oxidation states was previously unknown. Here we use structural biology techniques at room temperature to probe the dynamic behavior of FutA. Neutron diffraction confirmed four negatively charged tyrosinates that, together with a neutral water molecule, coordinate iron in trigonal bipyramidal geometry. Positioning of the positively charged Arg103 side chain in the second coordination shell yields an overall charge-neutral Fe(III) binding state in structures determined by neutron diffraction and serial femtosecond crystallography. Conventional rotation X-ray crystallography using a home source revealed X-ray induced photoreduction of the iron center with observation of the Fe(II) binding state; here, an additional positioning of the Arg203 side chain in the second coordination shell maintained an overall charge-neutral Fe(II) binding site. Dose series using serial synchrotron crystallography and an XFEL X-ray pump-probe approach capture the transition between Fe(III) and Fe(II) states, revealing how Arg203 operates as a switch to accommodate the different iron oxidation states. This switching ability of the Prochlorococcus FutA protein may reflect ecological adaptation by genome streamlining and loss of specialized FutA proteins.
Significance Statement: Oceanic primary production by marine cyanobacteria is a main contributor to carbon and nitrogen fixation. Prochlorococcus is the most abundant photosynthetic organism on Earth, with an annual carbon fixation comparable to the net global primary production from agriculture. Its remarkable ecological success is based on the ability to thrive in low nutrient waters. To manage iron limitation, Prochlorococcus possesses the FutA protein for iron uptake and homeostasis. We reveal a molecular switch in the FutA protein that allows it to accommodate binding of iron in either the Fe(III) or Fe(II) state using structural biology techniques at room temperature and provide a plausible mechanism for iron binding promiscuity.

19

Seasonal forcing and waning immunity drive the sub-annual periodicity of the COVID-19 epidemic

Rubin, I. N.; Bushman, M.; Lipsitch, M.; Hanage, W. P.

2025-03-06 epidemiology 10.1101/2025.03.05.25323464 medRxiv

Top 0.1%

44.1%

Seasonal trends in infectious diseases are shaped by climatic and social factors, with many respiratory viruses peaking in winter. However, the seasonality of COVID-19 remains in dispute, with significant waves of cases across the United States occurring in both winter and summer. Using wavelet analysis of COVID-19 cases, we find that the periodicity of epidemic COVID-19 varies markedly across the U.S. and correlates with winter temperatures, indicating seasonal forcing. However, seasonal forcing alone cannot explain the pattern of multiple waves per year that has been so disruptive and unique to COVID-19. Using a modified SIRS model that allows specification of the tempo of waning immunity, we show that specific forms of non-durable immunity can sufficiently explain the sub-annual waves characteristic of the COVID-19 epidemic.
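As a rough illustration of the kind of model this abstract describes, the sketch below integrates a textbook SIRS system with a sinusoidally forced transmission rate. It is not the authors' modified model: waning here is simple exponential loss of immunity at rate `omega`, and every name and parameter value (`simulate_sirs`, `beta0`, `amplitude`, `gamma`, `omega`) is an assumption chosen only for demonstration.

```python
import math


def simulate_sirs(beta0=0.3, amplitude=0.2, gamma=0.1, omega=1 / 180,
                  days=1095, dt=0.1, i0=1e-3):
    """Forward-Euler integration of a seasonally forced SIRS model.

    S, I, R are population fractions. All parameter values are
    illustrative assumptions, not fitted estimates.
    Returns a list of (t, S, I, R) tuples sampled once per day.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(0.0, s, i, r)]
    for step in range(1, days * steps_per_day + 1):
        t = (step - 1) * dt
        # Transmission rate oscillates with a one-year period.
        beta = beta0 * (1 + amplitude * math.cos(2 * math.pi * t / 365))
        new_infections = beta * s * i   # S -> I
        recoveries = gamma * i          # I -> R
        waning = omega * r              # R -> S (exponential waning)
        s += dt * (waning - new_infections)
        i += dt * (new_infections - recoveries)
        r += dt * (recoveries - waning)
        if step % steps_per_day == 0:
            trajectory.append((step * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    for t, s, i, r in simulate_sirs()[::90]:
        print(f"day {t:6.0f}  S={s:.3f}  I={i:.4f}  R={r:.3f}")
```

Varying `omega` (the waning rate) against the one-year forcing period is one simple way to explore how the tempo of immunity loss changes the spacing of waves in such a toy model.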

20

Cerebral cortical communication overshadows computational energy-use, but these combine to predict synapse number

Levy, W. B.; Calvert, V. G.

2021-02-16 neuroscience 10.1101/2021.02.15.431272 medRxiv

Top 0.1%

44.0%

Darwinian evolution tends to produce energy-efficient outcomes. On the other hand, energy limits computation, be it neural and probabilistic or digital and logical. Taking a particular energy-efficient viewpoint, we define neural computation and make use of an energy-constrained, computational function. This function can be optimized over a variable that is proportional to the number of synapses per neuron. This function also implies a specific distinction between ATP-consuming processes, especially computation per se vs the communication processes including action potentials and transmitter release. Thus, applying this mathematical function requires an energy audit with a partitioning of energy consumption that differs from earlier work. The audit points out that, rather than the oft-quoted 20 watts of glucose available to the brain (1, 2), the fraction partitioned to cortical computation is only 0.1 watts of ATP. On the other hand, at 3.5 watts, long-distance communication costs are 35-fold greater. Other novel quantifications include (i) a finding that the biological vs ideal values of neural computational efficiency differ by a factor of 10^8 and (ii) two predictions of N, the number of synaptic transmissions needed to fire a neuron (2500 vs 2000).
Significance Statement: Engineers hold up the human brain as a low energy form of computation. However, from the simplest physical viewpoint, a neuron's computation cost is remarkably larger than the best possible bits/J - off by a factor of 10^8. Here we explicate, in the context of energy consumption, a definition of neural computation that is optimal given explicit constraints. The plausibility of this definition as Nature's perspective is supported by an energy-audit of the human brain. The audit itself requires certain novel perspectives and calculations revealing that communication costs are 35-fold computational costs.